
Python Regular Expressions: Study Notes and Common Methods

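These notes record my process of learning Python regular expressions. They introduce re.search, a commonly used method of the re module, and explain what raw strings are for. A regular expression is a convenient tool for checking whether a string matches a pattern, and working through these notes covers the basics of using regular expressions in Python.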

Study notes, written down so I don’t forget.
A regular expression is a special sequence of characters that makes it easy to check whether a string matches a given pattern.

The approach above turns out to be very tedious.

The re module gives Python full regular expression support.

re.search scans the entire string. If a match is found, re.search returns a match object; otherwise it returns None.

re.search(pattern, string, flags=0)
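As a minimal sketch of the call above, the following uses a made-up sample string (not from the original notes) to show both outcomes: a match object when the pattern is found, and None when it is not.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "aardvark"

# The pattern occurs in the text, so a match object is returned.
match = re.search(r"aa", text)
print(match.span())    # start and end index of the match
print(match.group())   # the matched substring

# The pattern does not occur, so the result is None.
no_match = re.search(r"zz", text)
print(no_match)
```

Because a failed search returns None, it is usually safest to test the result before calling methods such as group() on it.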

The ’r’ at the beginning of the pattern indicates that this is a raw string. This means that the Python interpreter shouldn’t try to interpret any special characters and should instead pass the string to the function as is. In this example there are no special characters, so the raw string and the normal string are exactly the same, but it’s a good idea to always use raw strings for regular expressions in Python.
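A small sketch of what the raw-string prefix changes. The \b sequence is used here only as an illustration: in a normal Python string it is a backspace character, while in a regular expression it is a word boundary. The sample strings are assumptions.

```python
import re

# In a normal string, "\b" is a single backspace character.
# In a raw string, r"\b" is two characters: a backslash and the letter b.
normal_len = len("\b")
raw_len = len(r"\b")
print(normal_len, raw_len)

# Without the r prefix, every backslash has to be doubled.
same = (r"\bcat\b" == "\\bcat\\b")
print(same)

# The raw string reaches the regex engine unchanged, so \b works as a word boundary.
found = re.search(r"\bcat\b", "a cat sat")
print(found.group())
```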

We can also pass additional options to the search function. For example, if we want our match to be case insensitive, we can do this by passing the re.IGNORECASE option.
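A short sketch of the flags argument, using a hypothetical sentence. The same pattern fails in the default case-sensitive mode and succeeds once re.IGNORECASE is passed.

```python
import re

text = "I love Python"   # hypothetical sample text

# Case-sensitive by default: lowercase "python" is not in the text.
sensitive = re.search(r"python", text)
print(sensitive)

# With re.IGNORECASE the same pattern matches "Python".
insensitive = re.search(r"python", text, re.IGNORECASE)
print(insensitive.group())
```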

We’ve seen by now how we can use a dot in our regular expressions as a special character that can match any character. In the regex world, this is known as a wildcard because it can stand in for many different characters. By default the dot matches any single character except a newline.
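To illustrate the wildcard, here is a sketch that runs one pattern against a few made-up words. The dot must consume exactly one character, so the last word does not match.

```python
import re

pattern = r"p.ng"   # "p", any one character, then "ng"
words = ["penguin", "sponge", "clapping", "png"]   # hypothetical samples

results = {}
for word in words:
    m = re.search(pattern, word)
    results[word] = m.group() if m else None
    print(word, "->", results[word])
```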
Sometimes we may want to match any character that isn’t in a group. To do that, we use a circumflex inside the square brackets. For example, let’s create a search pattern that looks for any character that’s not a letter.
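A possible version of such a pattern, checked against a hypothetical sentence. The first search stops at the first space; adding a space to the excluded set moves the match to the final period.

```python
import re

text = "This is a sentence with spaces."   # hypothetical sample text

# [^a-zA-Z] matches any single character that is not a letter.
first_non_letter = re.search(r"[^a-zA-Z]", text)
print(first_non_letter.span())

# Excluding the space as well leaves only the trailing period.
first_non_letter_or_space = re.search(r"[^a-zA-Z ]", text)
print(first_non_letter_or_space.group())
```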
If we want to match either one expression or another, we can use the pipe symbol to do that. This lets us list alternative options that can get matched. For example, we could have an expression that matches either the word cat or the word dog, like this.
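A sketch of the cat-or-dog pattern on a made-up sentence. re.search returns only the leftmost match; re.findall, another function in the re module, is used here to list every match.

```python
import re

text = "I like both dogs and cats."   # hypothetical sample text

# re.search stops at the first alternative found when scanning left to right.
first = re.search(r"cat|dog", text)
print(first.group())

# re.findall returns all non-overlapping matches as a list of strings.
all_matches = re.findall(r"cat|dog", text)
print(all_matches)
```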
Regex concept: repeated matches. It’s quite common to see expressions that include a dot followed by a star. This means that it matches any character repeated as many times as possible, including zero.
re* matches zero or more occurrences of the preceding expression.
The star takes as many characters as possible. In programming terms, we say that this behavior is greedy. It’s possible to modify the repetition qualifiers to make them less greedy, but we won’t get into that now.
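The greedy behavior can be seen in a small sketch with hypothetical inputs. With a dot, the star runs past the word boundary to the last "n" it can reach; restricting the repeated set to lowercase letters keeps the match inside one word.

```python
import re

# .* is greedy: it extends to the last "n" in the string.
greedy = re.search(r"py.*n", "python programming")
print(greedy.group())

# [a-z]* cannot cross the space, so the match ends at the first word.
limited = re.search(r"py[a-z]*n", "python programming")
print(limited.group())

# The star also allows zero repetitions.
zero = re.search(r"py[a-z]*n", "pyn")
print(zero.group())
```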

Some implementations, like the one used by Python or by the egrep command, include two additional repetition qualifiers, plus and question mark, that can help us construct more complex expressions. The plus character matches one or more occurrences of the character that comes before it. Take the pattern o+l+ and check it against a few words.
When a word contains exactly one occurrence of each character, the match is the shortest string the pattern can match.
re+ matches one or more occurrences of the preceding expression.
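A sketch of the o+l+ pattern against a few made-up words. The first word gives the shortest possible match, the second shows both characters repeated, and the third fails because no "l" directly follows an "o".

```python
import re

pattern = r"o+l+"   # one or more "o" followed by one or more "l"
words = ["olive", "woolly", "boil"]   # hypothetical samples

plus_results = {}
for word in words:
    m = re.search(pattern, word)
    plus_results[word] = m.group() if m else None
    print(word, "->", plus_results[word])
```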
The question mark symbol is yet another multiplier. It means either zero or one occurrence of the character before it.
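re? matches zero or one occurrence of the preceding expression. On its own this qualifier is still greedy: it takes the one occurrence when it can. Only when a question mark follows another qualifier, as in *? or +?, does it make that qualifier non-greedy.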
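A sketch of the optional character, using hypothetical sentences. The "p" is matched when it is present and skipped when it is not.

```python
import re

# "p" is optional: absent here, so only "each" is matched.
without_p = re.search(r"p?each", "To each their own")
print(without_p.group())

# Present here, so it becomes part of the match.
with_p = re.search(r"p?each", "I like peaches")
print(with_p.group())
```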
Special characters: dot, star, plus, question mark, circumflex, dollar sign, and square brackets.
To match an actual dot, we need to use an escape character, which in the case of regular expressions is a backslash. So let’s add that to our pattern.
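A sketch of the difference the backslash makes, with made-up inputs. Unescaped, the dot matches the "l" in "welcome"; escaped, it only matches a literal period.

```python
import re

# Unescaped dot: any character followed by "com".
unescaped = re.search(r".com", "welcome")
print(unescaped.group())

# Escaped dot: a literal "." is required, so "welcome" no longer matches.
escaped_miss = re.search(r"\.com", "welcome")
print(escaped_miss)

escaped_hit = re.search(r"\.com", "mydomain.com")
print(escaped_hit.group())
```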
It can get really confusing with backslashes, since they’re also used to define some special string characters. We’ve called out, for example, that \n is a sequence used in Python to indicate a newline, and \t does the same for tabs. When we see a pattern that includes a backslash, it could be escaping a special regex character or a special string character.
Using raw strings, like we’ve been doing, helps avoid some of this possible confusion, because the special characters won’t be interpreted when generating the string. They will only be interpreted when parsing the regular expression. On top of this, Python also uses the backslash for a few special sequences that we can use to represent predefined sets of characters. For example, \w matches any alphanumeric character, including letters, numbers, and underscores. Let’s check out a couple of examples.
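A sketch of \w on two hypothetical strings. The \d (digits) sequence in the last call is not mentioned in the notes above; it is another predefined set in the re module, included here as an assumption about what a reader might try next.

```python
import re

# \w* stops at the first character that is not a letter, digit, or underscore.
first_word = re.search(r"\w*", "This is an example")
print(first_word.group())

# Underscores count as word characters, so the whole string matches.
with_underscores = re.search(r"\w*", "And_this_is_another")
print(with_underscores.group())

# \d matches a digit; \d+ collects runs of digits.
numbers = re.findall(r"\d+", "room 101, floor 7")
print(numbers)
```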
A great resource for testing out your regular expressions is a website called regex101.com. You can use this to try out your regexes, analyze each part of the expression, and figure out what’s up with them when they don’t work.
Python also offers numeric repetition qualifiers. These are written between curly brackets and can be one or two numbers specifying a range. For example, to match any string of exactly five letters, we can use an expression like [a-zA-Z]{5}.
When we look for all matches in a text, this pattern also matches the first five letters of any longer word. What if we wanted to match only the words that are exactly five letters long? We can do that by putting \b, which matches a word boundary, at the beginning and end of the pattern to indicate that we want full words, as in \b[a-zA-Z]{5}\b.
We can also have two numbers in the range. For example, if we wanted to match a run of five to ten letters, numbers, or underscores, we could use an expression like \w{5,10}.
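The three cases above can be sketched as follows. The sample sentences are assumptions, and re.findall is used so that every match is visible.

```python
import re

text = "a scary ghost appeared"   # hypothetical sample text

# Exactly five letters: also picks up the first five letters of "appeared".
five_letters = re.findall(r"[a-zA-Z]{5}", text)
print(five_letters)

# Word boundaries on both sides keep only whole five-letter words.
five_letter_words = re.findall(r"\b[a-zA-Z]{5}\b", text)
print(five_letter_words)

# Between five and ten word characters; longer words are cut at ten.
five_to_ten = re.findall(r"\w{5,10}", "I really like strawberries")
print(five_to_ten)
```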
Splitting and Replacing
Use a regular expression for the search and a plain string for the replacement.
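A sketch of both operations named in this section, using re.split and re.sub from the re module. The sentences, the email address, and the [REDACTED] placeholder are all made up for illustration.

```python
import re

# Splitting: break a text wherever a sentence-ending character appears.
pieces = re.split(r"[.?!]", "One sentence. Another one? And the last one!")
print(pieces)

# Replacing: the pattern is a regex, the replacement is a plain string.
line = "Received an email for go_nuts95@my.example.com"
redacted = re.sub(r"[\w.%+-]+@[\w.-]+", "[REDACTED]", line)
print(redacted)
```

Note that re.split leaves an empty string at the end of the list when the text ends with a separator.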
The replacement can also use regular expression notation, by referring to captured groups.
So once again we’d use parentheses to create capturing groups. In the first parameter, we’ve got an expression that contains the two groups that we want to match: one before the comma and one after the comma. The second parameter is the replacement for the matching string. In it we use \2 to indicate the second captured group, followed by a space, and \1 to indicate the first captured group. When referring to captured groups, a backslash followed by a number indicates the corresponding captured group. This is a general notation for regular expressions, and it’s used by many tools that support regexes, not just Python.
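A sketch of the swap described above. The exact pattern and the sample name are assumptions, since the original screenshot is not available. Each group accepts word characters, spaces, dots, and hyphens.

```python
import re

# Group 1 is the text before the comma, group 2 the text after it.
pattern = r"^([\w .-]*), ([\w .-]*)$"

# \2 \1 puts the second group first, then a space, then the first group.
swapped = re.sub(pattern, r"\2 \1", "Lovelace, Ada")
print(swapped)
```

The replacement is written as a raw string too, so that \1 and \2 reach re.sub as group references rather than being read as string escapes.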

Bonus: Regex Crossword

